CONSORF: a consensus prediction system for prokaryotic coding sequences
Abstract
CONSORF is a fully automatic high-accuracy identification system that provides consensus prokaryotic CDS information. It first predicts the CDSs supported by consensus alignments. The alignments are derived from multiple genome-to-proteome comparisons with other prokaryotes using the FASTX program. Then, it fills the empty genomic regions with the CDSs supported by consensus ab initio predictions. From those consensus results, CONSORF provides prediction reliability scores, predicted frame-shifts, alternative start sites and best pair-wise match information against other prokaryotes. These results are easily accessed from a website.
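The two-stage idea in the abstract (alignment-supported CDSs first, then ab initio CDSs only in the regions left empty) can be illustrated with a minimal sketch. This is not the CONSORF implementation. The `CDS` type, the function names, the grouping of predictions by strand and stop position, the support thresholds, and the strict no-overlap rule are all assumptions made for illustration, and the support count is only a crude stand-in for the reliability scores the abstract mentions.

```python
from collections import defaultdict
from typing import NamedTuple


class CDS(NamedTuple):
    """Hypothetical CDS record: 1-based inclusive coordinates plus strand."""
    start: int
    end: int
    strand: str  # "+" or "-"


def stop_key(cds):
    # Assumption: predictions sharing strand and stop position describe the
    # same gene and may differ only in their start site.
    return (cds.strand, cds.end if cds.strand == "+" else cds.start)


def consensus(prediction_sets, min_support):
    """Keep CDSs reported by at least `min_support` independent sources.

    `prediction_sets` holds one list of CDS per source (one genome-to-proteome
    comparison, or one ab initio predictor). Returns sorted (CDS, support) pairs.
    """
    groups = defaultdict(list)
    for source in prediction_sets:
        seen = set()
        for cds in source:
            key = stop_key(cds)
            if key not in seen:  # each source votes once per gene
                seen.add(key)
                groups[key].append(cds)
    result = []
    for members in groups.values():
        if len(members) >= min_support:
            # Pick the most frequently reported variant; break ties by length.
            best = max(set(members),
                       key=lambda c: (members.count(c), c.end - c.start))
            result.append((best, len(members)))
    return sorted(result)


def overlaps(a, b):
    return a.start <= b.end and b.start <= a.end


def two_stage(alignment_sets, ab_initio_sets, min_align=2, min_ab_initio=2):
    """Stage 1: alignment consensus. Stage 2: ab initio consensus in the gaps."""
    stage1 = consensus(alignment_sets, min_align)
    accepted = [cds for cds, _ in stage1]
    stage2 = [(cds, support)
              for cds, support in consensus(ab_initio_sets, min_ab_initio)
              if not any(overlaps(cds, a) for a in accepted)]
    return stage1, stage2


alignment_sets = [[CDS(1, 300, "+")], [CDS(1, 300, "+")], [CDS(31, 300, "+")]]
ab_initio_sets = [[CDS(100, 400, "+"), CDS(500, 800, "-")], [CDS(500, 800, "-")]]
stage1, stage2 = two_stage(alignment_sets, ab_initio_sets)
```

In this toy input, the three alignment sources agree on the stop position at 300 but disagree on the start, so the sketch keeps the majority start and could report the minority one as an alternative start site. Real prokaryotic genes can overlap by a few bases, so the strict overlap test here is a simplification.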
Related articles
A Compression-Based Approach for Coding Sequences Identification. I. Application to Prokaryotic Genomes
Most of the gene prediction algorithms for prokaryotes are based on Hidden Markov Models or similar machine-learning approaches, which imply the optimization of a high number of parameters. The present paper presents a novel method for the classification of coding and non-coding regions in prokaryotic genomes, based on a suitably defined compression index of a DNA sequence. The main features of...
Next-Generation Annotation of Prokaryotic Genomes with EuGene-P: Application to Sinorhizobium meliloti 2011
The availability of next-generation sequences of transcripts from prokaryotic organisms offers the opportunity to design a new generation of automated genome annotation tools not yet available for prokaryotes. In this work, we designed EuGene-P, the first integrative prokaryotic gene finder tool which combines a variety of high-throughput data, including oriented RNA-Seq data, directly into the...
How to Deal with Small Open Reading Frames?
Current ’classical’ algorithms recognizing protein coding sequences do not work effectively with sequences of small length. To deal with this problem we have proposed some improvements of the existing gene finders without any assumed arbitrary threshold. Introduced parameters describe position of tested sequences in the ranking of all small Open Reading Frames and short protein coding genes fou...
Identification and Functional Prediction of Long Non-Coding RNAs Responsive to Drought stress in Lens culinaris L.
Drought stress is one of the main environmental factors that affects growth and productivity of crop plants, including lentil. In the course of evolution, crucial genetic regulations mediated by non-coding RNAs (ncRNAs) have emerged in plants in response to drought and other abiotic stresses. In the present study, after identifying lncRNAs within the expression profile of lentil, RNA-s...
FrameD: a flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences
We describe FrameD, a program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in bacterial GC-rich genomes, the gene model used in FrameD also allows genes to be predicted in the presence of frameshifts and partially undetermined sequences, which makes it very suitable for gene prediction and frameshift correction in unfinished ...
Journal: Bioinformatics
Volume 23, Issue 22
Publication date: 2007